ORIGINAL ARTICLE



Identification of a genetic locus on chromosome 
4q34-35 for type 2 diabetes with overweight 

Mi-Hyun Park 1,2, Soo Heon Kwak 3, Kwang Joong Kim 1, Min Jin Go 1, Hye-Ja Lee 1, Kyung-Seon Kim 1,
Joo-Yeon Hwang 1, Kuchan Kimm 1, Young-Min Cho 3, Hong Kyu Lee 3, Kyong Soo Park 3,4 and
Jong-Young Lee 1

The incidence of type 2 diabetes is rising rapidly because of an increase in the incidence of overweight and obesity.
Identification of genetic determinants for complex diseases, such as type 2 diabetes, may provide insight into disease 
pathogenesis. The aim of the study was to investigate the shared genetic factors that predispose individuals to being overweight 
and developing type 2 diabetes. We conducted genome-wide linkage analyses for type 2 diabetes in 386 affected individuals 
(269 sibpairs) from 171 Korean families and association analyses with single-nucleotide polymorphisms of candidate genes 
within linkage regions to identify genetic variants that predispose individuals to being overweight and developing type 2 
diabetes. Through fine-mapping analysis of chromosome 4q34-35, we detected a locus potentially linked (nonparametric
linkage 2.81, logarithm of odds 2.27, P = 6 × 10^-4) to type 2 diabetes in overweight or obese individuals (body mass index,
BMI ≥ 23 kg m^-2). Multiple regression analysis with type 2 diabetes-related phenotypes in 1314 normal individuals revealed
significant associations between GPM6A (rs13144140; false discovery rate (FDR) P = 0.006) and BMI and waist-hip ratio,
and between NEIL3 (rs6830266; FDR P = 0.002) and insulin level. Our systematic
search using genome-wide linkage and association analyses demonstrates that a linkage peak for type 2 diabetes on chromosome
4q34-35 contains two type 2 diabetes-related genes, GPM6A and NEIL3.
INTRODUCTION

Type 2 diabetes mellitus is one of the most common metabolic 
disorders in the world. According to the International Diabetes 
Federation estimates, nearly 194 million people had type 2 
diabetes in 2003, and this number is expected to increase to 
333 million by 2025. 1 Although the pathogenesis of type 2 
diabetes is not completely understood, it is well established 
that the disease is a consequence of complex interactions 
between both genetic and environmental factors. The quest to 
elucidate the genetic causes of type 2 diabetes was advanced 
with the recent advent of genome-wide association studies, 
from which nearly 25 new genetic loci robustly associated with
type 2 diabetes risk have been described. 2-9 However, it is likely
that additional type 2 diabetes risk genes are yet to be 
discovered. 

A complementary approach to genome-wide association 
studies for discovering susceptibility genes is genome-wide 
linkage analysis, which has relatively more power for identifying
rare high-risk disease alleles. 10 This approach uses affected
sibpairs (ASPs), nuclear families and multigenerational
kindreds to define chromosomal loci that contain candidate
disease genes. More than 50 genome-wide linkage studies 
have demonstrated the presence of different type 2 diabetes 
susceptibility loci in different ethnic groups or 
endophenotypes. 11 The usual paradigm for exploiting
genome-wide linkage analysis is to focus on chromosomal 
regions having an established disease linkage and perform a 
fine-mapping and candidate gene study. 

Among the various environmental risk factors for type 2 
diabetes, obesity is considered to be the most important 
determinant. 12 Obesity is influenced by increased caloric 
intake and physical inactivity, but individual susceptibility 
also has a genetic basis. Recent studies have revealed genes, 
such as FTO, associated with both diabetes and obesity, 13 
which may increase the risk of type 2 diabetes by 
modulating body mass index (BMI). Obesity and type 2 
diabetes have complex interactions, and therefore, evaluation 
of the genetic risk factors according to BMI subgroups could 
be valuable. Furthermore, dividing the subjects into similar 
BMI subgroups would increase the chance of detecting a 
positive interaction. 

In this study, we report the results of the first genome-wide 
linkage study for type 2 diabetes in the Korean population, 
followed by fine mapping using subgroup analysis according to 
BMI to increase the probability of identifying type 2
diabetes-linked genetic loci.

METHODS 

Subjects 

For the linkage study, nuclear Korean families with at least two 
siblings with type 2 diabetes were recruited from Seoul National 
University Hospital (SNUH). A total of 386 affected individuals (269 
ASPs in 171 families) were considered for the study; parents and other 
normal siblings were not included. All subjects enrolled in this study 
were of Korean ethnicity. Diabetes was diagnosed based on the 
American Diabetes Association criteria. 14 

For the association study, 378 unrelated type 2 diabetes patients 
and 382 unrelated non-diabetic control subjects were recruited from 
SNUH. The diabetic subjects were randomly recruited from patients 
in the SNUH outpatient clinic. Non-diabetic control subjects were
recruited from an unselected population undergoing a routine health
checkup at SNUH. Subjects' height, weight, and waist and hip
circumferences were measured, as were fasting plasma glucose and
plasma insulin concentrations. For the replication study, 949
unrelated type 2 diabetes subjects and 932 control subjects were selected
from projects of the Korean Health and Genome Study (KHGS). 15 A
total of 932 participants in the cohorts who had no history of
diabetes and no first-degree relatives with diabetes were recruited as
normal control subjects. In addition, the normal control subjects were 
not taking medication for diabetes, hypertension or dyslipidemia. 

The Institutional Review Boards of the Clinical Research Institute 
at the SNUH and the Korean National Institute of Health approved 
the study protocol, and written informed consent was obtained from 
each subject. Table 1 presents the clinical characteristics of the subjects 
in the linkage and association studies. 

Genotyping of microsatellite markers 

Genotyping was performed using the fluorescently labeled human 
ABI Prism Linkage Mapping Set MD-10 (Applied Biosystems, Foster 
City, CA, USA), comprising 400 informative microsatellite markers 
with an average intermarker spacing of 9.7 cM. Each marker set 
included a fluorescently labeled forward primer and a tailing reverse 
primer. All markers were dinucleotide repeats of the type (CA)n,
originally chosen from the Genethon/CEPH map. 16 

PCR (95 °C for 2 min, then 35 cycles of 95 °C for 20 s, 58 °C for 40 s 
and 72 °C for 30 s, followed by a final 72 °C for 45 min) was performed 
in a 384-well plate on a GeneAmp PCR system (9700 Biblock; Applied
Biosystems) in 5 µl reactions containing 10-20 ng genomic DNA,
2.5 mmol l^-1 MgCl2, 0.25 mmol l^-1 dNTPs, 1 pmol of primer and
0.2 U EF-Taq DNA polymerase in 1× PCR buffer (SolGent, Daejeon,
Korea). Automated 96-channel and 384-channel pipettes (Hydra I, II; 
Matrix Technologies, Hudson, NH, USA) were used for the pipetting 
steps. Electrophoresis and signal recording were performed on ABI 
3730 automated sequencers (Applied Biosystems) using standard 
protocols. We used GeneScan 500 Liz (Applied Biosystems) as the 
internal size standard because it assists polymorphic fragment length 
calling and allows more accurate allele calling and unambiguous 
comparison of data across experimental conditions. 

Genotypes were initially scored using GENEMAPPER 3.7 software, 
and were reviewed independently to confirm the accuracy of allele 
calling. All genotyped markers were checked for incompatibilities 
using the program MARKERINFO from the SAGE (statistical analysis 
for genetic epidemiology) package. Database-aided quality-control 
procedures included confirmation of standard individual genotypes 
(CEPH standard 1347-02; Coriell Institute for Medical Research, 
Camden, NJ, USA), plate identity, orientation and allele size. 

Linkage analyses and linkage programs 

For ASP linkage analysis screening, nonparametric multipoint
analyses of all autosomal chromosomes were performed with the
programs GENEHUNTER v2.1 and Merlin v1.0.1, considering all
the pairs as independent. For the X chromosome, the multipoint
nonparametric linkage (NPL) score was calculated using GENEHUNTER
Plus, which implements both the NPL Z-score and the allele-sharing
logarithm of odds (LOD). 17,18 Allele frequencies for the 400
markers were estimated based on the entire data set using Recode 
(Division of Statistical Genetics, Department of Human Genetics, 
University of Pittsburgh, Pittsburgh, PA, USA). Marker order and 
intermarker distances were taken from published Genethon/CEPH 
maps. 16 Because parental genotypes were missing, estimated probabilities were
calculated for all possible parental genotypes conditioned on the 
sibship genotype information. 

We also used the SAGE package, which is based on methods 
proposed by Haseman and Elston. 19 Allele frequencies for the markers 
were estimated using the SAGE program FREQ. Pedigree relationships 
were tested using the RELTEST program. Alleles shared identical by 
descent (IBD) among independent sibpairs were calculated using the 
GENIBD program. For linkage testing, the mean proportion of alleles
shared IBD among ASPs (π) was estimated and tested against the null
hypothesis of no linkage (π = 0.5) by the SIBPAL program of the SAGE
software package (H0: mean IBD sharing (mean of the π_i) = 1/2
(= 0 × P(f0) + 0.5 × P(f1) + 1 × P(f2), where P(fk) is the probability of
sharing k alleles IBD); HA: mean IBD sharing > 1/2), with significant
excess of sharing taken as evidence for linkage.
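
The mean test described above can be illustrated with a minimal sketch. It is not the SIBPAL implementation: it assumes a simple one-sided large-sample z-test of the mean estimated sharing against 0.5, and the sharing estimates in `pi_hat` are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def expected_sharing(f0, f1, f2):
    """Expected proportion of alleles shared IBD, given the probabilities
    of sharing 0, 1 and 2 alleles IBD."""
    return 0.0 * f0 + 0.5 * f1 + 1.0 * f2


def mean_ibd_test(pi_hat):
    """One-sided test of H0: mean sharing = 0.5 against HA: mean sharing > 0.5.

    Assumption: a normal approximation to the sample mean is used here;
    SIBPAL's exact procedure may differ.
    """
    n = len(pi_hat)
    m = mean(pi_hat)
    se = stdev(pi_hat) / math.sqrt(n)
    z = (m - 0.5) / se
    p = 1.0 - NormalDist().cdf(z)
    return m, z, p


# Under no linkage, sibs share 0, 1 or 2 alleles IBD with
# probabilities 1/4, 1/2 and 1/4, which gives an expected sharing of 0.5.
null_sharing = expected_sharing(0.25, 0.5, 0.25)

# Hypothetical per-pair sharing estimates (not data from this study).
pi_hat = [0.5, 0.75, 1.0, 0.5, 0.25, 0.75, 0.5, 1.0, 0.5, 0.75]
m, z, p = mean_ibd_test(pi_hat)
print(f"mean sharing = {m:.2f}, z = {z:.2f}, one-sided P = {p:.4f}")
```

A mean sharing estimate above 0.5 with a small one-sided P corresponds to the "excess of sharing" that is taken as evidence for linkage.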

To explore possible heterogeneity in our data set and to overcome 
any associated reduction in power, subgroup analysis was performed 
according to BMI. We used a BMI of 23 kg m^-2 as a cutoff value for
being overweight (including obesity), and sibpairs in which both siblings
had a BMI ≥ 23 kg m^-2 were included in the subgroup analysis.
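
As a small illustration of this selection rule, the sketch below keeps only sibpairs in which both members meet the cutoff. The record layout (`bmi_sib1`, `bmi_sib2`) and the example values are hypothetical and are not taken from the study data.

```python
BMI_CUTOFF = 23.0  # kg m^-2, the overweight cutoff used in this study


def overweight_pairs(pairs, cutoff=BMI_CUTOFF):
    """Return the sibpairs in which both siblings have BMI >= cutoff."""
    return [
        pair for pair in pairs
        if pair["bmi_sib1"] >= cutoff and pair["bmi_sib2"] >= cutoff
    ]


# Hypothetical sibpairs for illustration only.
example_pairs = [
    {"family": "F1", "bmi_sib1": 24.1, "bmi_sib2": 23.0},  # both meet cutoff
    {"family": "F2", "bmi_sib1": 25.3, "bmi_sib2": 22.4},  # one below cutoff
    {"family": "F3", "bmi_sib1": 21.0, "bmi_sib2": 22.9},  # both below cutoff
]
selected = overweight_pairs(example_pairs)
print([pair["family"] for pair in selected])
```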

Association studies 

Single-nucleotide polymorphisms (SNPs) were chosen from gene-centric
SNPs in the linkage region on 4q34-35 (175 924 556-185 473 235)
and included SNPs located directly within coding, promoter,
non-coding and intron regions.

Genotyping was performed using the Illumina GoldenGate
genotyping system 20 (Illumina, San Diego, CA, USA), and data
quality was assessed using duplicate DNAs. All data considered for
genotyping analysis had a genotype quality score ≥ 0.25. SNPs that
did not satisfy the following criteria were excluded: (i) a minimum
call rate of 90%; (ii) no duplicate errors; (iii) Hardy-Weinberg
equilibrium P value > 0.001. In total, 201 SNPs (SNUH: 166, KHGS:
66, overlap: 31) were used for association analysis. Associations
between SNPs and type 2 diabetes-related phenotypes among normal
subjects were determined by linear regression analysis while
controlling for age, gender and BMI. Association was tested using
co-dominant, dominant and recessive models. Statistical analyses were
performed using SAS software (SAS Institute, Cary, NC, USA).
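
The SNP exclusion criteria and the three genetic models can be sketched as follows. This is an illustration under stated assumptions, not the authors' SAS code: the Hardy-Weinberg test is assumed to be a Pearson chi-square test with 1 degree of freedom (the text does not say which test was used), the co-dominant model is assumed to use additive 0/1/2 coding, and all function names are hypothetical.

```python
import math


def hwe_p_value(n_aa, n_ab, n_bb):
    """Pearson chi-square (1 df) P value for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(call_rate, duplicate_errors, hwe_p):
    """Apply the three criteria listed above: call rate of at least 90%,
    no duplicate errors, and HWE P value > 0.001."""
    return call_rate >= 0.90 and duplicate_errors == 0 and hwe_p > 0.001


def encode_genotype(minor_allele_count, model):
    """Code a genotype (0, 1 or 2 minor alleles) as a regression predictor."""
    if model == "codominant":  # assumption: additive coding
        return minor_allele_count
    if model == "dominant":  # at least one minor allele
        return 1 if minor_allele_count >= 1 else 0
    if model == "recessive":  # two minor alleles
        return 1 if minor_allele_count == 2 else 0
    raise ValueError(f"unknown model: {model}")


# Hypothetical genotype counts for illustration only.
p_balanced = hwe_p_value(25, 50, 25)   # counts match HWE expectations
p_no_hets = hwe_p_value(50, 0, 50)     # no heterozygotes at all
print(passes_qc(0.98, 0, p_balanced), passes_qc(0.98, 0, p_no_hets))
```

The coded genotype would then enter a linear regression of the phenotype together with the covariates named above (age, gender and BMI).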

RESULTS 

Genome-wide linkage analysis in the whole study group 

Whole-genome linkage analysis with 400 microsatellite
markers was performed on 269 Korean ASPs (171
families) with type 2 diabetes. The observed Z-scores of
multipoint NPL scores from the genome-wide scan are shown
in Supplementary Figure S1. The maximum score was on
chromosome 4, with a peak multipoint NPL score of 1.52
(P = 0.06) at marker D4S415 in a region encompassing 185 cM.

Subgroup analysis of individuals with BMI ≥ 23 kg m^-2

To assess a possible interaction between BMI and diabetes, we 
carried out subgroup analysis with a BMI cutoff of 23 kg m^-2.
The BMI < 23 kg m^-2 group had a multipoint NPL score of
2.13 for 18q22 (Supplementary Figure S2). The
BMI ≥ 23 kg m^-2 group had a multipoint NPL score of 1.73
for 1q31 and 2.24 for 4q34 (Figure 1).

The 4q34 region in the BMI ≥ 23 kg m^-2 group had the
highest NPL score of all the regions examined. To confirm this
linkage, which had been analyzed with GENEHUNTER v2.1, we
reanalyzed chromosome 4 in the BMI ≥ 23 kg m^-2 subset of
132 ASPs (101 families) using the SAGE software package.
Among the 22 markers on chromosome 4, D4S1539, D4S415
and D4S1535 passed the statistical threshold for linkage.
Marker D4S415 showed an especially significant result, with
a mean proportion of alleles shared IBD of 0.56
(P = 0.007; Supplementary Table S1).

We also carried out a linkage analysis for BMI on chromosome
4 using the traditional Haseman-Elston procedure
included in the SAGE package. Again, we found evidence for 
linkage at marker D4S415 on 4q34 with P = 0.009 (data not 
shown). 
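
For readers unfamiliar with the traditional Haseman-Elston procedure, the sketch below shows its core idea: the squared trait difference of each sibpair is regressed on the estimated proportion of alleles shared IBD at a marker, and a negative slope suggests linkage. The data are hypothetical and the ordinary least-squares slope is computed by hand; the SAGE implementation includes significance testing that is not reproduced here.

```python
def haseman_elston_slope(trait_pairs, pi_hat):
    """Least-squares slope of squared sib-pair trait differences on the
    estimated proportion of alleles shared IBD."""
    y = [(a - b) ** 2 for a, b in trait_pairs]
    n = len(y)
    mean_x = sum(pi_hat) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in pi_hat)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(pi_hat, y))
    return sxy / sxx


# Hypothetical BMI values (kg m^-2) for six sibpairs and their
# estimated IBD sharing at one marker; not data from this study.
bmi_pairs = [(22, 28), (24, 27), (25, 26), (21, 26), (23, 25), (26, 26)]
sharing = [0.0, 0.5, 1.0, 0.0, 0.5, 1.0]

slope = haseman_elston_slope(bmi_pairs, sharing)
print(f"regression slope = {slope:.1f}")
```

In this toy example, pairs that share more alleles IBD have more similar BMI, so the slope is negative.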

Fine mapping of the chromosome 4q linkage region 

Figure 1 Multipoint NPL score maps of all autosomal chromosomes in the BMI ≥ 23 kg m^-2 group with 132 ASPs (101 families).
Multipoint NPL scores were calculated using GENEHUNTER v2.1. In each graph, the left vertical axis indicates the NPL score, the
horizontal axis indicates the length of each chromosome, and the locus numbers on the horizontal axis indicate the positions of the
microsatellite markers.

For high-resolution mapping, 132 ASPs in the
BMI ≥ 23 kg m^-2 group were genotyped with eight additional
microsatellite markers (D4S3033, D4S2952, D4S2979,
D4S1595, D4S2991, D4S1607, D4S3015 and D4S2920) that
covered the candidate regions on chromosome 4q. Multipoint
NPL analysis showed a significantly increased Z value, with an
NPL score of 2.81 at marker D4S3015 (LOD
2.27, P = 6 × 10^-4) at 188.8 cM (Figure 2). We also examined
the P-values of the mean proportion test calculated by the
SIBPAL program. Among the 30 markers on chromosome 4,
including the additional eight markers listed above, D4S415,
D4S1607, D4S3015, D4S2920 and D4S1535 passed the
statistical threshold for linkage. The D4S3015 marker showed an
especially significant result, with a mean proportion of alleles
shared IBD of 0.58 (P = 0.001; Supplementary
Table S2).
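
The relation between the fine-mapping LOD score and its P value can be checked with a short calculation. The text does not state how P was derived; the sketch assumes the common asymptotic result that 2 ln(10) × LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 degree of freedom, and, separately, that the NPL score can be read as a standard normal deviate. Under the first assumption, a LOD of 2.27 gives a one-sided P close to 6 × 10^-4, in line with the reported value.

```python
import math
from statistics import NormalDist


def lod_to_p(lod):
    """One-sided P value for a LOD score, assuming 2*ln(10)*LOD is
    distributed as a 50:50 mixture of 0 and chi-square with 1 df."""
    chi2 = 2.0 * math.log(10.0) * lod
    # math.erfc(sqrt(x / 2)) is the chi-square (1 df) survival function.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


def npl_to_p(npl_z):
    """One-sided P value for an NPL score treated as a standard normal Z."""
    return 1.0 - NormalDist().cdf(npl_z)


p_lod = lod_to_p(2.27)   # LOD reported at D4S3015
p_npl = npl_to_p(2.81)   # NPL score reported at D4S3015
print(f"P from LOD = {p_lod:.1e}, P from NPL Z = {p_npl:.1e}")
```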

The 'one-LOD drop' support interval was identified as a
9.5-Mb region between markers D4S1539 and D4S1535, with 
an NPL score > 1.81. 
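
A support interval of this kind can be derived from a score curve as sketched below. The threshold quoted above (1.81) equals the peak NPL score minus one (2.81 - 1), so the sketch applies a one-unit drop to a score curve. Apart from the peak of 2.81 at 188.8 cM, the positions and scores are hypothetical and serve only to show the procedure.

```python
def one_unit_drop_interval(positions, scores, drop=1.0):
    """Return (left, right, threshold): the contiguous region around the
    peak in which the score stays at or above (peak score - drop)."""
    peak = max(range(len(scores)), key=scores.__getitem__)
    threshold = scores[peak] - drop
    left = peak
    while left > 0 and scores[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(scores) - 1 and scores[right + 1] >= threshold:
        right += 1
    return positions[left], positions[right], threshold


# Hypothetical score curve (cM, NPL score); only the peak value and
# its position are taken from the text.
positions_cm = [170.0, 175.0, 180.0, 185.0, 188.8, 192.0, 196.0]
npl_scores = [0.9, 1.5, 1.9, 2.3, 2.81, 2.0, 1.2]

left_cm, right_cm, cutoff = one_unit_drop_interval(positions_cm, npl_scores)
print(f"support interval: {left_cm}-{right_cm} cM (score >= {cutoff:.2f})")
```

In practice the interval bounds are reported as the flanking markers, as done here with D4S1539 and D4S1535.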

Association studies in the chromosome 4q34-35 linkage region

To identify candidate genes predisposing individuals to type 2 
diabetes, we examined possible associations between type 2 
diabetes and SNPs from 23 genes in the 9.5-Mb 4q34-35 
linkage region. In the first association study, we analyzed the 
association of 166 SNPs with type 2 diabetes in 411 SNUH
subjects (282 cases, 129 controls) with BMI ≥ 23 kg m^-2.
In the logistic analysis, several SNPs in the gene encoding
glycoprotein M6a (GPM6A) showed marginal association with
type 2 diabetes (Supplementary Table S3). We also analyzed
the association of 66 SNPs with type 2 diabetes in 1326 KHGS
subjects (835 cases, 491 controls) with BMI ≥ 23 kg m^-2.
In the logistic analysis, several SNPs in the gene encoding
Nei endonuclease VIII-like 3 (NEIL3) showed marginal
association with type 2 diabetes (Supplementary Table S3).
We next conducted a combined analysis with 1737 SNUH and
KHGS individuals with BMI ≥ 23 kg m^-2. Among the 31
SNPs, two SNPs in NEIL3, rs11940019 and rs17676249,
showed an odds ratio of 4.80 (P = 0.02) using the recessive
model. In the case of GPM6A, haplotype analysis
(rs23332251(T)/rs7675676(G)) showed an association
(P = 0.008) with type 2 diabetes (Supplementary Table S3).
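
To make the recessive-model odds ratio concrete, the sketch below computes an unadjusted odds ratio from genotype counts, comparing carriers of two risk alleles with all other subjects. It is a simplification of the logistic analysis used in the study (no covariates), the 95% confidence interval uses the Woolf log-odds approximation, and the genotype counts are hypothetical.

```python
import math


def recessive_odds_ratio(case_counts, control_counts):
    """Unadjusted odds ratio under a recessive model.

    Each argument is (n_AA, n_Aa, n_aa), where a is the risk allele.
    Returns (odds_ratio, ci_low, ci_high) with a 95% Woolf interval.
    """
    a = case_counts[2]                         # cases with aa
    b = case_counts[0] + case_counts[1]        # cases with AA or Aa
    c = control_counts[2]                      # controls with aa
    d = control_counts[0] + control_counts[1]  # controls with AA or Aa
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, ci_low, ci_high


# Hypothetical genotype counts (AA, Aa, aa); not data from this study.
cases = (600, 200, 20)
controls = (350, 140, 5)

or_value, or_low, or_high = recessive_odds_ratio(cases, controls)
print(f"OR = {or_value:.2f} (95% CI {or_low:.2f}-{or_high:.2f})")
```

Because few subjects are homozygous for a low-frequency allele, recessive-model estimates of this kind tend to have wide confidence intervals.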

Figure 2 Multipoint NPL score map of chromosome 4. The whole and BMI ≥ 23 kg m^-2 groups were analyzed using 22 markers at
9.5 cM resolution, and fine mapping in the BMI ≥ 23 kg m^-2 group was analyzed with 30 markers, including eight additional markers in the
40.6-cM interval from 157.9 to 198.5 cM. Multipoint NPL scores were calculated using GENEHUNTER v2.1. In the graph,
the left vertical axis indicates the NPL score, and the horizontal axis indicates the markers of chromosome 4. The additional eight
markers are shown in red.

We next performed quantitative trait analysis by assessing
the association of GPM6A and NEIL3 with type 2
diabetes-related phenotypes in normal control subjects. Results for the
multiple regression analysis of association between GPM6A
and NEIL3, and BMI, waist-hip ratio (WHR), and fasting
insulin level are shown in Table 2. The GPM6A SNPs were
associated with BMI and WHR, and the NEIL3 SNPs were
associated with fasting insulin level in 382 SNUH and 932
KHGS subjects. In the combined 1314 normal subjects
(SNUH + KHGS), the rs13152426 and rs13144140 SNPs in
GPM6A were significantly associated with BMI (P = 0.0004,
false discovery rate (FDR) 0.0062) and WHR (P = 0.0007, FDR
0.0109). The rs6850861, rs6823018, rs6830266 and rs2048077
SNPs in NEIL3 were significantly associated (P = 0.0003, FDR
0.0023) with fasting insulin level. However, the fasting glucose
level was not significantly associated with any GPM6A or
NEIL3 SNPs (data not shown).
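
The FDR values reported above adjust nominal P values for the number of SNPs tested. The text does not name the FDR procedure; the sketch below assumes the widely used Benjamini-Hochberg step-up method and applies it to hypothetical P values, so it will not reproduce the reported figures.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that each adjusted value is
    # the minimum of p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical nominal P values for four SNPs.
nominal = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(nominal)
print([round(value, 4) for value in fdr])
```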

DISCUSSION 

We report here the first genome-wide search for chromosomal
loci associated with type 2 diabetes susceptibility in
Korean subjects. Our results reveal evidence of linkage
at the 4q34-35 locus in subjects with BMI ≥ 23 kg m^-2. The
study is comprehensive because we performed genome-wide
linkage analysis, fine mapping and association analysis
with quantitative traits. Another strength of our study is
that the ethnicity of the Korean population is relatively
homogeneous, resulting in a higher probability of identifying
diabetes-linked loci.

There have been more than 50 type 2 diabetes linkage 
analysis studies conducted in various populations, but few loci
with strong evidence for linkage have been replicated. 11 The
4q34-35 region showed a replicated linkage signal that was
reported in several populations. Significant evidence for
linkage has been obtained for marker D4S1501 on 4q34 in
Ashkenazi Jewish individuals, 21 and modest evidence for
linkage between type 2 diabetes and chromosome 4q34-q35
was detected in Finnish families. 22 In addition, the 4q34 region
contains susceptibility loci in French whites. 23 In the French
study, the loci were detected when subjects were subdivided
according to a BMI of 27 kg m^-2, which is similar to our
approach. 

We used a BMI of 23 kg m^-2 as a cutoff value for being
overweight. A WHO expert consultation considered whether a
population-specific cutoff point for BMI was necessary, and
concluded that a substantial proportion of the Asian population
is at high risk for type 2 diabetes and that many Asians
have BMIs lower than the existing WHO cutoff point for being
overweight (≥ 25 kg m^-2) compared with Caucasians (in
general) or European populations. 24 Another WHO report
indicated that Asian adults with a BMI > 23.0 should be 
considered overweight. 25 

We found evidence of linkage for type 2 diabetes from 
subgroups with BMI ^ 23 kg m -2 . These results indicate an 
interaction between susceptibility loci and obesity. Moreover, 
subgrouping by BMI may have increased our chance of 
discovering risk loci. Conversely, subgrouping could lead to
false-positive results. However, because we confirmed our results in two
replication sets and performed a quantitative trait analysis,
the possibility of false positives seems low.

GPM6A (glycoprotein M6A) acts as a stress-response gene
during hippocampal formation. The relationship between
GPM6A, type 2 diabetes and being overweight is unknown.
However, a small leucine-rich glycoprotein called decorin was
reported to be associated with type 2 diabetes and obesity,
possibly through upregulated decorin expression in adipose
tissue of type 2 diabetes subjects. 26 To assess the putative
function of the GPM6A variant, we analyzed the effect of the
rs13144140 SNP (intron 1, NM_201591) using an in silico
approach. The FASTSNP program allows users to efficiently
identify and prioritize high-risk SNPs according to their
phenotypic risks and putative functional effects. 27 The
analysis of rs13144140 with FASTSNP predicted it to have a
functional effect by altering a transcription factor binding site.
Using TRANSFAC, 28 we found that rs13144140 correlated
with binding of HNF-1, a transcription factor that controls
multiple genes implicated in pancreatic β-cell function. 29
These observations were further supported by matching the
HNF-1 DNA binding domain sequence (TGCAAATCATTTTC)
using the Transcriptional Regulatory Element Database. 30
This intronic variant (rs13144140) could be an internal
enhancer element that regulates GPM6A expression in an
adipose tissue-specific manner.

NEIL3 belongs to a class of DNA glycosylases homologous 
to the bacterial Fpg/Nei family. These glycosylases initiate the 
first step in base excision repair by cleaving bases damaged by 
reactive oxygen species and introducing a DNA strand break 
via the associated lyase reaction. 31 Three human genes, 
designated NEIL1, NEIL2 and NEIL3, encode proteins that 
contain sequence homologies to Fpg and Nei. 32 Deletion of 
NEIL1 results in a metabolic syndrome. In the absence of 
exogenous oxidative stress, neil1 knockout (neil1-/-) and
heterozygous (neil1+/-) mice develop severe obesity,
dyslipidemia and fatty liver disease, and also have a tendency
to develop hyperinsulinemia. 33 Although the role of NEIL3 in
type 2 diabetes or obesity is not yet known, it may have a
function similar to that of NEIL1.

In summary, we have identified GPM6A and NEIL3 as being 
associated with overweight and type 2 diabetes using a 
systematic search through genome-wide linkage and linkage 
region-based association analyses in Koreans. The chromosomal
locus 4q34-35 was linked to type 2 diabetes in subjects
with BMI ≥ 23 kg m^-2. The genes located in this region were
associated with metabolic traits, such as insulin level, BMI and
WHR. Further studies of these two genes in different populations
are required.